• OpenAccess
    • List of Articles Data Mining

      • Open Access Article

        1 - A method for clustering customers using RFM model and grey numbers in terms of uncertainty
        Azime Mozafari
        The purpose of this study is to present a method for clustering bank customers based on the RFM model under uncertainty. In the proposed framework, after the parameter values of the RFM model are determined, namely recency of exchange (R), frequency of exchange (F), and monetary value of exchange (M), grey theory is used to eliminate the uncertainty and customers are segmented using a different approach. Bank customers are thus clustered into three main segments: good, ordinary, and bad customers. After cluster validation using the Dunn index and the Davies-Bouldin index, the properties of the customers in each segment are identified. Finally, recommendations are offered to improve the customer relationship management system.
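As a rough illustration of the first step described in this abstract, the sketch below derives R, F, and M values from a list of transactions. The function name `rfm_values`, the transaction tuple layout, and the sample data are assumptions made for this example; the abstract does not specify how the authors computed these parameters.

```python
from datetime import date


def rfm_values(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    `transactions` is an iterable of (customer_id, date, amount) tuples.
    This layout is a hypothetical one chosen for the example.
    """
    rfm = {}
    for customer, day, amount in transactions:
        recency = (today - day).days
        # Start from this transaction's recency if the customer is new.
        r, f, m = rfm.get(customer, (recency, 0, 0.0))
        # R: days since the most recent exchange, F: count, M: total amount.
        rfm[customer] = (min(r, recency), f + 1, m + amount)
    return rfm


# Made-up sample data, for illustration only.
sample = [
    ("c1", date(2024, 1, 10), 120.0),
    ("c1", date(2024, 3, 1), 80.0),
    ("c2", date(2023, 11, 5), 40.0),
]
values = rfm_values(sample, today=date(2024, 3, 31))
```

In a pipeline like the one the abstract outlines, these crisp values would then be converted to grey (interval) numbers before clustering; that step is not shown here.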
      • Open Access Article

        2 - Proposing a Model for Extracting Information from Textual Documents, Based on Text Mining in E-learning
        Somayeh Ahari
        As computer networks become the backbone of science and the economy, enormous quantities of documents become available, and text mining techniques are used to extract useful information from this textual data. Text mining has become an important research area that discovers unknown information, facts, or new hypotheses by automatically extracting information from different written documents. It aims to disclose concealed information by means of methods that, on the one hand, can cope with the large number of words and structures in natural language and, on the other hand, can handle vagueness, uncertainty, and fuzziness. Text mining, also referred to as text data mining or knowledge discovery from textual databases and roughly equivalent to text analytics, is the process of deriving high-quality information from text documents, typically by extracting patterns or knowledge. This research presents a survey of text mining techniques and their applications in e-learning. Relevant studies in the field of e-learning were classified, and the related problems and solutions were then extracted. The paper first defines text mining, then describes the text mining process and its applications in the e-learning domain. It next introduces text mining techniques and considers each of them in the context of e-learning. Finally, it proposes a model for information extraction using text mining techniques in the e-learning domain.
      • Open Access Article

        3 - Integrating Data Envelopment Analysis and Decision Tree Models in Order to Evaluate Information Technology-Based Units
        Amir Amini, Ali Alinezhad, Somaye Shafaghizade
        To evaluate the performance and desirability of its units' activities, every organization needs an evaluation system; this is even more important for financial institutions, including information technology-based companies. Data envelopment analysis (DEA) is a non-parametric method for measuring the effectiveness and efficiency of decision-making units (DMUs). Data mining techniques, on the other hand, allow DMUs to explore and discover meaningful information previously hidden in large databases. This paper presents a general framework for combining DEA and regression trees to evaluate the effectiveness and efficiency of DMUs. The resulting hybrid model is a set of rules that policy makers can use to discover the reasons behind efficient and inefficient DMUs. To examine the factors related to productivity with the proposed method, a sample of 18 branches of Iran Insurance in Tehran was selected as a case study. After modeling based on the advanced model, the input-oriented LVM model with weak disposability in data envelopment analysis was calculated using undesirable output, and the decision tree technique was then used to extract and discover rules explaining increased and reduced productivity.
      • Open Access Article

        4 - Provide a method for customer segmentation using the RFM model in conditions of uncertainty
        Mohammadreza Gholamian, Azime Mozafari
        The purpose of this study is to provide a method for segmenting the customers of a private bank in Shiraz based on the RFM model under uncertainty about customer data. In the proposed framework, the values of the RFM model indicators, namely recency of exchange (R), number of exchanges (F), and monetary value of exchange (M), were first extracted from the customer database and preprocessed. Given the breadth of the data, no exact number can determine whether a customer is good or bad; therefore, to eliminate this uncertainty, grey number theory was used, which treats the customer's status as a range. Using this different method, the bank's customers were segmented into three main clusters: good, normal, and bad customers. After the clusters were validated using the Dunn and Davies-Bouldin indices, customer characteristics in each cluster were identified, and suggestions were made to improve the customer relationship management system.
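The cluster validation step mentioned in this abstract can be illustrated with the Dunn index, which divides the smallest distance between points in different clusters by the largest distance between points in the same cluster. The sketch below is a minimal version of that textbook definition using Euclidean distance; the function name `dunn_index` and the toy clusters are assumptions for the example, not the authors' data or code.

```python
from itertools import combinations
from math import dist


def dunn_index(clusters):
    """Dunn index for a list of clusters, each a list of point tuples.

    Higher values mean compact, well-separated clusters. Assumes at least
    two clusters and at least one cluster with two or more distinct points.
    """
    # Smallest distance between two points that lie in different clusters.
    separation = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    # Largest distance between two points that lie in the same cluster.
    diameter = max(
        dist(p, q)
        for cluster in clusters
        for p, q in combinations(cluster, 2)
    )
    return separation / diameter


# Toy clusters standing in for three customer segments (made-up values).
clusters = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(5.0, 0.0), (5.0, 1.0)],
    [(10.0, 0.0), (10.0, 2.0)],
]
score = dunn_index(clusters)
```

Here the closest points in different clusters are 5 units apart and the widest cluster spans 2 units, so the index is 2.5.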
      • Open Access Article

        5 - An Improved Method for Detecting Phishing Websites Using Data Mining on Web Pages
        Mahdiye Baharloo, Alireza Yari
        Phishing reduces trust among users in business networks based on the e-commerce framework. Therefore, in this research, we tried to detect phishing websites using data mining. Identifying the salient features of phishing is an important prerequisite for designing an accurate detection system. To this end, a list of 30 features proposed for phishing websites was first prepared. Then, a two-stage feature reduction method based on feature selection and extraction was proposed to enhance the efficiency of phishing detection systems; it reduced the number of features significantly. Finally, the performance of the J48 decision tree, random forest, and naïve Bayes methods was evaluated on the reduced features. The results indicated that the model built with two-stage feature reduction based on the wrapper method and Principal Component Analysis (PCA) achieved an accuracy of 96.58% with the random forest method, which is a desirable outcome compared to other methods.
      • Open Access Article

        6 - Presenting the model for opinion mining at the document feature level for hotel users' reviews
        Elham Khalajj, Shahriyar Mohammadi
        Nowadays, online reviews expressing users' sentiments and opinions are an important part of how people decide whether to choose a product or use a service. Thanks to the Internet and easy access to opinion blogs in the tourism and hotel industry, there are huge and rich sources of opinions in text form, which can be explored with text mining methods. Because of the importance of users' sentiments and opinions, especially in the tourism and hotel industry, opinion mining, sentiment analysis, and the exploration of user-written texts have received attention from those in charge. This research presents a new combined method based on a common approach in sentiment analysis: using words to produce features for classifying reviews. Two lexicon construction methods are developed, one using statistical methods and the other using a genetic algorithm. The resulting words are combined with a general sentiment lexicon and Bing Liu's standard lexicon of opinion words to increase classification accuracy.
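The idea of using words to produce features for classifying reviews can be sketched as counting lexicon hits in each review. The tiny `POSITIVE` and `NEGATIVE` word sets, the function name `lexicon_features`, and the sample review below are all hypothetical; they stand in for the constructed and standard lexicons the abstract mentions and are not taken from the paper.

```python
import re

# Hypothetical miniature lexicons, for illustration only.
POSITIVE = {"clean", "friendly", "comfortable", "great"}
NEGATIVE = {"dirty", "noisy", "rude", "broken"}


def lexicon_features(review):
    """Turn one review into word-count features a classifier could use."""
    # Lowercase the text and keep alphabetic tokens (apostrophes allowed).
    tokens = re.findall(r"[a-z']+", review.lower())
    return {
        "positive": sum(t in POSITIVE for t in tokens),
        "negative": sum(t in NEGATIVE for t in tokens),
        "length": len(tokens),
    }


features = lexicon_features(
    "Great location and friendly staff, but the room was noisy."
)
```

Feature dictionaries of this kind would normally be passed to a classifier; the choice of classifier is left open here because the abstract does not name one.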
      • Open Access Article

        7 - Design and implementation of a survival model for patients with melanoma based on data mining algorithms
        Farinaz Sanaei, Seyed Abdollah Amin Mousavi, Abbas Toloie Eshlaghy, Ali Rajabzadeh Ghotri
        Background/Purpose: Among the most commonly diagnosed cancers, melanoma is the second leading cause of cancer-related death, and a growing number of people are affected by it. Melanoma is also the most malignant and a rare form of skin cancer. Advanced cases may cause death because the disease spreads to internal organs. The National Cancer Institute reported that approximately 99,780 people were diagnosed with melanoma in 2022 and approximately 7,650 died. This study therefore aims to develop an optimization algorithm for predicting the survival of melanoma patients. Methodology: This applied research was a descriptive-analytical, retrospective study. The study population included patients with melanoma identified from the National Cancer Research Center at Shahid Beheshti University between 2008 and 2013, with a follow-up period of five years. An optimization model for melanoma survival prognosis was selected based on the evaluation metrics of data mining algorithms. Findings: A neural network, a Naïve Bayes network, a Bayesian network, a combination of decision tree and Naïve Bayes network, logistic regression, J48, and ID3 were selected as the models applied to the national database. Statistically, the studied neural network outperformed the other selected algorithms on all evaluation metrics. Conclusion: The results showed that the neural network, with a value of 0.97, has optimal performance in terms of reliability. This predictive model of melanoma survival performed better in terms of both discrimination power and reliability, and the algorithm was therefore proposed as a melanoma survival prediction model.
      • Open Access Article

        8 - Presenting a web recommender system for user nose pages using DBSCAN clustering algorithm and machine learning SVM method.
        Reza Molaee Fard, Mohammad Mosleh
        Recommender systems can predict future user requests and generate a list of the user's favorite pages. In other words, they can build an accurate profile of user behavior and predict the page the user will choose next, which can solve the system's cold-start problem and improve search quality. This research presents a new method for improving web recommender systems. It uses the DBSCAN clustering algorithm to cluster the data; this algorithm obtained an efficiency score of 99%. The user's favorite pages are then weighted using the PageRank algorithm. Next, the data are classified with the SVM method and a combined recommender system generates predictions, finally providing the user with a list of pages that may be of interest. The evaluation indicated that the proposed method achieves a recall of 95% and an accuracy of 99%, which shows that this recommender system correctly detects more than 90% of the user's intended pages and largely resolves the weaknesses of previous systems.
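The page-weighting step can be illustrated with a plain power-iteration PageRank. This is a generic sketch of the standard algorithm, not the authors' weighting scheme; the function name `pagerank`, the damping factor of 0.85, the iteration count, and the three-page link graph are assumptions for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical site structure: both inner pages link back to "home".
ranks = pagerank({"home": ["news", "shop"], "news": ["home"], "shop": ["home"]})
```

In this toy graph "home" ends up with the largest weight because both other pages link to it, while "news" and "shop" receive equal weights.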
      • Open Access Article

        9 - Anomaly and Intrusion Detection Through Data Mining and Feature Selection using PSO Algorithm
        Fereidoon Rezaei, Mohamad Ali Afshar Kazemi, Mohammad Ali Keramati
        Today, with the development of technology, the increased use of the Internet in business, and the shift of businesses from physical to virtual and Internet-based forms, attacks and anomalies have also moved from the physical to the virtual world. That is, instead of robbing a store or market, individuals intrude into websites and virtual markets through cyberattacks and disrupt them. Detecting attacks and anomalies is one of the new challenges in promoting e-commerce technologies. Network anomalies and destructive activities in e-commerce can be detected by analyzing the behavior of network traffic. Data mining techniques are used extensively in intrusion detection systems (IDS) to detect anomalies. Reducing the dimensionality of the features plays an important role in intrusion detection, because detecting anomalies in high-dimensional network traffic features is a time-consuming process. Choosing suitable and accurate features affects the speed of the analysis and thus improves the speed of detection. In this article, using data mining algorithms such as Bayesian, Multilayer Perceptron, CFS, Best First, J48, and PSO, we increased the accuracy of detecting anomalies and attacks to 0.996 and reduced the error rate to 0.004.
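The two figures reported at the end of this abstract are complementary: the error rate is one minus the accuracy. The sketch below shows how both follow from confusion-matrix counts. The function name `accuracy_and_error` and the counts are hypothetical, chosen only so that the proportions match the reported values; they are not the paper's results.

```python
def accuracy_and_error(tp, tn, fp, fn):
    """Return (accuracy, error_rate) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # The error rate is the complement of accuracy.
    return accuracy, 1.0 - accuracy


# Made-up counts for 1000 traffic records: 996 classified correctly.
acc, err = accuracy_and_error(tp=480, tn=516, fp=3, fn=1)
```

With these counts the accuracy is 0.996 and the error rate is 0.004, up to floating-point rounding.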